{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "private_outputs": true,
      "provenance": [],
      "authorship_tag": "ABX9TyNCFK+y9Vr8VxmP/Qs5FYzO",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/ktynski/Marketing_Automations_Notebooks_With_GPT/blob/main/Automatic_Intent_Prediction_from_Keyword_%2B_Serps_with_Automatic_Article_Title_Description_Generation_Based_on_Intent_Predictions_(Public).ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "1. Install the required libraries:"
      ],
      "metadata": {
        "id": "h5VQynWcqtWq"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install openai\n",
        "!pip install google-search-results"
      ],
      "metadata": {
        "id": "4XthS5jEwPgq"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "2. Import required libraries"
      ],
      "metadata": {
        "id": "GFa9Nu5FwSbC"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "WvKY4xmLqqEb"
      },
      "outputs": [],
      "source": [
        "import openai\n",
        "import re\n",
        "import pandas as pd\n",
        "import requests\n",
        "from serpapi import GoogleSearch"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "2. Define the intent rubric."
      ],
      "metadata": {
        "id": "16BTjCBiqxtg"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "topic = \"Enter your topic here\"\n",
        "intents = [\n",
        "    \"Given the search engine result page for the search query, classify the user's intent into one of the following 5 categories: \\n Informational: The user is looking for general information on a topic.\\n Navigational: The user is looking to find a specific website. \\n Transactional: The user is looking to complete a transaction, such as purchasing a product or booking a service. \\n Commercial Investigation: The user is comparing products or services or researching options for a potential purchase. \\n Local: The user is looking for information about a specific local business or service. \\n Please provide a brief explanation of your reasoning, justify your classification decision, and estimate a confidence score (between 0 and 1) for each category, indicating how confident you are in your prediction.\"\n",
        "\n",
        "]"
      ],
      "metadata": {
        "id": "OKzhaff2vtn_"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "3. Initialize the OpenAI API key and the model name:"
      ],
      "metadata": {
        "id": "0dLuDpt3rFUP"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Initialize the OpenAI API key\n",
        "openai.api_key = \"Your OpenAI api key\"\n",
        "model_engine = \"text-davinci-003\""
      ],
      "metadata": {
        "id": "uk_AoWBYq7hQ"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "4. Define a function that takes the topic and generates a list of high-intent types of searches for that subtopic. The more you ask for the more API calls will be done, but feel free to change 10 to any number."
      ],
      "metadata": {
        "id": "Th33Ipe-tAJK"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def generate_subtopics(topic):\n",
        "    completions = openai.Completion.create(\n",
        "        engine=model_engine,\n",
        "        prompt=\"Provide a list of 10 general search query categories related to \" + topic + \" that likely have high intent.\",\n",
        "        max_tokens=400,\n",
        "        n=1,\n",
        "        stop=None,\n",
        "        temperature=0.7,\n",
        "    )\n",
        "\n",
        "    message = completions.choices[0].text\n",
        "    subtopics = message.strip().split(\"\\n\")\n",
        "    subtopics = \"\\n\".join(subtopics)\n",
        "\n",
        "    nonumbersubtopics= re.sub(r\"^\\d+\\.\", \"\", subtopics, flags=re.MULTILINE)\n",
        "\n",
        "    subtopics = nonumbersubtopics.split(\"\\n\")\n",
        "    print(subtopics)\n",
        "    return subtopics\n",
        "\n",
        "#subtopics = generate_subtopics(\"fishing\")"
      ],
      "metadata": {
        "id": "oNoR4tzntIaX"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "5. Generate keywords for each subtopic. The more you ask for the more API calls will be done, but feel free to change 10 to any number.\n"
      ],
      "metadata": {
        "id": "-_NoAoYhrIZn"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def generate_keywords(subtopic):\n",
        "    completions = openai.Completion.create(\n",
        "        engine=model_engine,\n",
        "        prompt=\"Provide a List of 10 representative high intent search keywords for \" + subtopic,\n",
        "        max_tokens=400,\n",
        "        n=1,\n",
        "        stop=None,\n",
        "        temperature=0.7,\n",
        "    )\n",
        "\n",
        "    message = completions.choices[0].text\n",
        "    keywords = message.strip().split(\"\\n\")\n",
        "    keywords = \"\\n\".join(keywords)\n",
        "\n",
        "    nonumbertext = re.sub(r\"^\\d+\\.\", \"\", keywords, flags=re.MULTILINE)\n",
        "\n",
        "    keywords = nonumbertext.split(\"\\n\")\n",
        "\n",
        "    print(keywords)\n",
        "    return keywords\n",
        "\n",
        "#keywords = generate_keywords(\"fishing tackle\")"
      ],
      "metadata": {
        "id": "U5HpdOV3rHmX"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "6. Get SerpAPI Data"
      ],
      "metadata": {
        "id": "N2YUm2ynz_PL"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def get_serp_data(keyword):\n",
        "    api_key = \"Your SerpAPI api key\"\n",
        "    params = {\n",
        "        \"engine\": \"google\",\n",
        "        \"q\": keyword,\n",
        "        \"api_key\": api_key\n",
        "    }\n",
        "    search = GoogleSearch(params)\n",
        "    results = search.get_dict()\n",
        "    serp_data = results[\"organic_results\"]\n",
        "\n",
        "    titles = \", \".join([title for title in [d.get(\"title\", None) for d in serp_data] if title is not None])\n",
        "    links = \", \".join([link for link in [d.get(\"link\", None) for d in serp_data] if link is not None])\n",
        "    displayed_results = \", \".join([str(displayed_result) for displayed_result in [d.get(\"displayed_results\", None) for d in serp_data] if displayed_result is not None])\n",
        "    snippet_highlighted_words = \", \".join([str(snippet_highlighted_word) for snippet_highlighted_word in [d.get(\"snippet_highlighted_words\", None) for d in serp_data] if snippet_highlighted_word is not None])\n",
        "\n",
        "    finalresult =  keyword + \", \" + titles + \", \" + links + \", \" + displayed_results + \", \" + snippet_highlighted_words\n",
        "    print(finalresult)\n",
        "    return finalresult\n",
        "\n",
        "\n"
      ],
      "metadata": {
        "id": "6z9Ee443KUzx"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "7. Define a function that takes a keyword and the intent rubric, and returns a dictionary with the scores for each category in the rubric:"
      ],
      "metadata": {
        "id": "Wzd_axahrOaG"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import concurrent.futures\n",
        "\n",
        "def get_serp_data_concurrent(keyword):\n",
        "    return get_serp_data(keyword)\n",
        "\n",
        "def evaluate_keyword_concurrent(keyword, intent):\n",
        "    keywordserps = None\n",
        "    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:\n",
        "        future_keywordserps = executor.submit(get_serp_data_concurrent, keyword)\n",
        "        keywordserps = future_keywordserps.result()\n",
        "\n",
        "    completions = openai.Completion.create(\n",
        "        engine=model_engine,\n",
        "       prompt=f\"You are an expert at predicting search intent when given both the search phrase used, and the search results from Google for that phrase. Given the intent rubric: \\n '{intents}' \\n Please try the example below: \\n Search Keyword: '{keyword}' \\n SERP Data:: '{keywordserps}',\\n Please provide the intent prediction as well as a brief overview of a description of the intent as you understand it from the SERP data. \\n Provide the result in the following format: [Intent Category / Intent Confidence Score / Detailed Explanation] Intent Prediction:\",\n",
        "        max_tokens=800,\n",
        "        n=1,\n",
        "        stop=None,\n",
        "        temperature=0.7,\n",
        "    )\n",
        "\n",
        "    message = completions.choices[0].text\n",
        "    #first_word = message.strip().split(\" \")[0]\n",
        "\n",
        "\n",
        "    return message\n",
        "\n",
        "# Generate the subtopics and keywords\n",
        "subtopics = generate_subtopics(topic)\n",
        "keyword_lists = [generate_keywords(subtopic) for subtopic in subtopics]\n",
        "\n",
        "# Evaluate the keywords and create a dataframe\n",
        "rows = []\n",
        "with concurrent.futures.ThreadPoolExecutor(max_workers=32) as executor:\n",
        "    future_to_row = {executor.submit(evaluate_keyword_concurrent, keyword, intent): (subtopic, keyword, intent) for i, subtopic in enumerate(subtopics) for keyword in keyword_lists[i] for intent in intents}\n",
        "    for future in concurrent.futures.as_completed(future_to_row):\n",
        "        subtopic, keyword, intent = future_to_row[future]\n",
        "        score = future.result()\n",
        "        rows.append([subtopic, keyword]+[score if x == intent else None for x in intents])\n",
        "\n",
        "df = pd.DataFrame(rows, columns=[\"Subtopic\", \"Keyword\", \"Intent\"])\n",
        "\n",
        "\n",
        "# Export the dataframe to a CSV file\n",
        "df.to_csv(\"Intent_Prediction.csv\", index=False)\n",
        "\n"
      ],
      "metadata": {
        "id": "jP-4TPqIqHTG"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "8. Check the Dataframe and make sure things look good:"
      ],
      "metadata": {
        "id": "nLVd4_BoVHON"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df"
      ],
      "metadata": {
        "id": "VjTMJo2FE5SA"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "9. Generate Article Titles and Outlines Given the Intent"
      ],
      "metadata": {
        "id": "Qe0KR6OFXHvP"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import concurrent.futures\n",
        "import re\n",
        "\n",
        "def generate_outlines(intent):\n",
        "    completions = openai.Completion.create(\n",
        "        engine=model_engine,\n",
        "        prompt=f\"For the given intent descriptions: \\n '{intent}' Provide an article title and article outline.\",\n",
        "        max_tokens=1500,\n",
        "        n=1,\n",
        "        stop=None,\n",
        "        temperature=0.7,\n",
        "    )\n",
        "\n",
        "    message = completions.choices[0].text\n",
        "    titleandoutline = message\n",
        "\n",
        "    print(titleandoutline)\n",
        "    return titleandoutline\n",
        "\n",
        "df[\"Article Ideas\"] = \"\"\n",
        "\n",
        "# Create an executor and submit tasks to the executor\n",
        "with concurrent.futures.ThreadPoolExecutor() as executor:\n",
        "    future_to_index = {executor.submit(generate_outlines, row[\"Intent\"]): index for index, row in df.iterrows()}\n",
        "    for future in concurrent.futures.as_completed(future_to_index):\n",
        "        index = future_to_index[future]\n",
        "        article_idea = future.result()\n",
        "        df.at[index, \"Article Ideas\"] = article_idea\n",
        "\n",
        "# Save the updated dataframe to a file\n",
        "df.to_csv(\"Intent_and_Article_Suggestions.csv\", index=False)"
      ],
      "metadata": {
        "id": "4xY0rCXiXMCO"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}